Racial discrimination continues to be pervasive in cultures throughout the world. Researchers examined the level of racial discrimination in the United States labor market by randomly assigning identical résumés black-sounding or white-sounding names and observing the impact on requests for interviews from employers.
In the dataset provided, each row represents one resume. The 'race' column takes two values, 'b' and 'w', indicating a black-sounding or a white-sounding name, respectively. The 'call' column takes two values, 1 and 0, indicating whether or not the resume received a callback from an employer.
Note that the 'b' and 'w' values in race are assigned randomly to the resumes.
Perform a statistical analysis to establish whether race has a significant impact on the rate of callbacks for resumes.
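One common way to compare two callback rates is a two-proportion z-test. The sketch below is a minimal illustration under that assumption, not necessarily the analysis intended for this exercise. The function name `two_proportion_ztest` is hypothetical, and the counts passed to it are placeholders, not values from the dataset.

```python
import numpy as np
from scipy import stats


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for H0: both groups share the same success rate."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis of no difference between groups.
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    # Standard error of the difference in proportions under the null.
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal survival function.
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value


# Placeholder counts for illustration only; they are NOT taken from the dataset.
z, p_value = two_proportion_ztest(30, 200, 45, 200)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

With the real data, the inputs would presumably be the callback count and the number of resumes in each 'race' group, for example `data[data.race == 'b'].call.sum()` and `len(data[data.race == 'b'])`. The normal approximation is reasonable only when each group has a fair number of callbacks and non-callbacks.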
In [2]:
import pandas as pd
import numpy as np
from scipy import stats
In [3]:
data = pd.read_stata('data/us_job_market_discrimination.dta')
In [4]:
# number of callbacks for black-sounding names
data[data.race == 'b'].call.sum()
Out[4]: